Gigras, Yogita
- Significance of Hybrid Feature Selection Technique for Intrusion Detection Systems
Authors
1 Department of CSE/IT, The NorthCap University, Gurugram - 122017, Haryana, IN
2 NCU University Gurugram, IN
Source
Indian Journal of Science and Technology, Vol 9, No 48 (2016)
Abstract
Objectives: Intrusion detection is a necessity in a technical world where data is generated and changes at a very rapid rate. In the last decade, feature selection has given a new perspective to research on Intrusion Detection Systems. The objective of this paper is to analyze and compare various feature selection techniques against a new hybrid Particle Swarm Optimization (PSO) technique. Statistical Analysis: Well-known filter and wrapper feature selection techniques are explored along with a hybrid PSO technique on the standard KDDCup99 dataset. A comparative analysis is performed over four filter techniques and two wrapper-based techniques. Four different classifiers are compared to select the one providing the best accuracy on the dataset. Findings: The hybrid PSO feature selection technique gives a significant improvement in prediction capability compared to traditional feature selection approaches. The analysis shows that the SVM classifier provides better classification results, so SVM is used as the classifier. The analysis over four filter and two wrapper techniques shows that hybrid PSO provides better results, with 98.6% accuracy and a 24-feature subset. Application/Improvements: The analysis establishes the importance of hybrid PSO, which may be applied not only to intrusion detection but also to other areas where feature reduction is required.
Keywords
Binary Particle Swarm Optimization (BPSO), Hybrid PSO, Intrusion Detection System (IDS), Support Vector Machine (SVM)
- An Efficient Hierarchical Clustering Technique for Medical Diagnosis Using KNN Classifier
Authors
1 Department of Computer Science, The NorthCap University, IN
Source
Artificial Intelligent Systems and Machine Learning, Vol 9, No 4 (2017), Pagination: 62-69
Abstract
In this research article, an intelligent hierarchical clustering technique for a medical diagnosis system is proposed. Hierarchical clustering techniques and their variants have been explored extensively in the field of machine learning. These techniques are deterministic and stable, and they do not require a predetermined number of clusters. However, they do not scale to high-dimensional data sets because of non-linear correlations. In this paper, a new approach based on hierarchical clustering is proposed for medical data classification. The proposed technique has the following features: (i) in each cycle, rather than recomputing the centroids of new clusters from scratch, the new centroids are estimated from the centroids of the previous cycle; and (ii) in every run, rather than merging just a single pair of items, several pairs are merged at the same time.
Keywords
Clustering, Hierarchical Agglomerative Clustering, K-Nearest Neighbor (KNN), Feature Selection, Filter and Wrapper Model, Medical Data.
- Mining Patterns for Clustering Using Modified K-Means and SVM (Support Vector Machine)
Authors
1 NorthCap University, Gurgaon, IN
Source
Artificial Intelligent Systems and Machine Learning, Vol 9, No 5 (2017), Pagination: 87-92
Abstract
Data mining can be described as the process of extracting patterns (knowledge) from, and posing queries on, data stored in a database. Classification is one of its core concepts and techniques. This research article proposes a novel hybrid mining approach that uses modified K-Means and the Support Vector Machine (SVM) algorithm. Modified K-Means is used to form clusters from the given dataset, and SVM is used for classification on the clustered dataset obtained from modified K-Means. Experiments are performed over different datasets taken from the UCI repository. The datasets used for comparing clustering algorithms are listed in Table 1 along with their details. Each dataset is evaluated on the following measures: the accuracy obtained from the new algorithm and the confusion matrix created for that dataset. The proposed algorithm provides better results than the others.
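The cluster-then-classify idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how K-Means is modified, so plain Lloyd's K-Means stands in for it; the SVM is a simple linear one trained by subgradient descent on the hinge loss; the cluster assignments are assumed to serve as the SVM's training labels; and the toy points are invented, not taken from the UCI datasets.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-Means (a stand-in, not the paper's modified variant)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assign(points, centroids)

def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01):
    """Linear SVM via subgradient descent on the L2-regularised hinge loss.
    Targets y must be -1 or +1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                # Point is inside the margin: move the hyperplane towards it.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Point is safely classified: apply only the regulariser.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict_svm(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.4, 0.4),
          (5.0, 5.0), (5.5, 4.8), (4.9, 5.4), (5.2, 5.1)]

centroids, cluster_labels = kmeans(points, k=2)
targets = [1 if lab == 1 else -1 for lab in cluster_labels]  # cluster id -> SVM target
w, b = train_linear_svm(points, targets)
predictions = [predict_svm(w, b, p) for p in points]
```

Comparing `predictions` against `targets` pair by pair gives the entries of a confusion matrix, which is the evaluation measure the abstract mentions.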
Keywords
Confusion Matrix, Clustering, K-Means, Modified K-Means, SVM (Support Vector Machine).
- Logistic Regression for Breast Cancer Analysis
Authors
1 Department of Computer Science, The NorthCap University, IN
Source
Data Mining and Knowledge Engineering, Vol 9, No 6 (2017), Pagination: 109-113
Abstract
In this study, logistic regression on mammograms is used to diagnose breast cancer. The aim of using logistic regression is to identify the significant clinical factors that contribute most to a higher probability of breast cancer. The sample data set is taken from the UC Irvine repository and modeled using the regression model. A 10-fold cross-validation is applied on the training data set to avoid overfitting. The sample data set contains mammogram samples collected through a survey conducted by a radiologist. The classification table of 450 samples shows a correct classification percentage for mammograms of 96.6%. The result is then compared with 30 validation samples, for which the correct classification rate is 68.9%. The simulation results indicate that the logistic regression model is able to map relationships among attributes, giving more accurate classification.
Keywords
Breast Cancer, Mammograms, Prediction, Logistic Regression, Factors and Accuracy.
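The logistic regression with 10-fold cross-validation described in the abstract of this article can be sketched as follows. This is an illustration only, under stated assumptions: the data is synthetic (two Gaussian groups generated here, not the UC Irvine mammogram data), the model is fitted by plain batch gradient descent, and all function names and hyperparameters are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, epochs=300, lr=0.1):
    """Batch gradient descent on the mean log-loss; y holds 0/1 labels."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

def k_fold_accuracy(X, y, k=10, seed=0):
    """Accuracy on each of k held-out folds; the model is refitted per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, b, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores

# Synthetic stand-in data: 50 samples per class, two features each.
rng = random.Random(42)
X = ([(rng.gauss(-2, 1), rng.gauss(-2, 1)) for _ in range(50)]
     + [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(50)])
y = [0] * 50 + [1] * 50

scores = k_fold_accuracy(X, y)
mean_accuracy = sum(scores) / len(scores)
```

Because every sample is held out exactly once, `mean_accuracy` estimates performance on unseen data, which is why cross-validation helps reveal overfitting that accuracy on the training samples alone would hide.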
- Machine Learning Algorithm Used for Detecting Malicious PDF Document
Authors
1 Department of Computer Science, The NorthCap University, Gurugram, IN
Source
Software Engineering, Vol 10, No 4 (2018), Pagination: 61-65
Abstract
In the field of computer security, malware is a constant problem and its prevalence is increasing rapidly. Cyber criminals make heavy use of PDF documents for launching attacks. These attacks routinely result in the loss of confidential information. Attackers attach malicious PDF documents to emails to deliver malicious code to ordinary users, and use social engineering to get them to open the email attachment. This article outlines a machine learning based approach for differentiating between malicious and benign PDF documents by analyzing the essential differences in the structural properties of the documents. We have compared the proposed system with other machine learning classifiers over 6000 real-world benign and malicious files. Finally, this research work presents some machine learning techniques for the detection of malicious PDF documents.
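As a rough illustration of what "structural properties" could mean in practice, the sketch below counts a few PDF name tokens in a file's raw bytes and returns the counts as a feature vector for a classifier. The abstract does not list the features actually used, so the token list, the function name, and the sample bytes are all assumptions.

```python
# Hypothetical token list: PDF names often associated with active content,
# plus object and stream terminators as a rough size indicator.
STRUCTURAL_TOKENS = [
    b"/JavaScript", b"/JS", b"/OpenAction", b"/Launch",
    b"/EmbeddedFile", b"endobj", b"endstream",
]

def extract_structural_features(data: bytes) -> dict:
    """Count non-overlapping occurrences of each token in the raw PDF bytes."""
    return {token.decode("ascii"): data.count(token) for token in STRUCTURAL_TOKENS}

# Tiny hand-written sample, not a complete or valid PDF file.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    b"%%EOF"
)

features = extract_structural_features(sample)
```

For real files the bytes would come from `open(path, "rb").read()`. Note that plain byte counting misses tokens hidden inside compressed or obfuscated objects, so a production detector would need a proper PDF parser.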
Keywords
Portable Document Format (PDF), Malicious PDF Document, Machine Learning, Malware Detection.
- A Survey on Air Quality Sensing and Management System Using IOT
Authors
1 Department of Computer Science and Engineering, The NorthCap University, Gurugram, Haryana, IN
Source
Artificial Intelligent Systems and Machine Learning, Vol 10, No 5 (2018), Pagination: 115-119
Abstract
Pollution levels are rising day by day because of factors such as population growth, the increasing number of vehicles, industrialization and urbanization. This harms human wellbeing by directly affecting the health of the population exposed to the pollution. This paper presents a solution for monitoring air and noise pollution levels in an industrial environment using the Internet of Things (IoT). Its main objective is to introduce an IoT-based air pollution monitoring system capable of detecting pollutants on roads and measuring various types of pollutants in the air. The paper also reports the status of air pollution in a particular city along with the temperature. The system offers a low-cost solution and gives good results in controlling air pollution, especially in urban areas.
To control this harmful level of pollution, there is an urgent need to design a system that senses the level of pollution region by region and recommends appropriate measures to be taken at that level of pollution.
Keywords
Internet of Things (IoT), CO Sensor, CO2 Sensor, Temperature and Humidity Sensor, Air Pollution, Arduino Microcontroller, Wi-Fi Module, LCD Display, Android, PM Levels.